Conversation
Signed-off-by: Kai Xu <kaix@nvidia.com>
Codecov Report: ✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##             main   #1011    +/-   ##
==========================================
- Coverage   72.12%  70.11%   -2.01%
==========================================
  Files         209     221      +12
  Lines       23628   25459    +1831
==========================================
+ Hits        17042   17851     +809
- Misses       6586    7608    +1022
Signed-off-by: Meng Xin <mxin@nvidia.com>
Added a separate PTQ skill; it needs further tuning. Claude Opus can follow the skill, but Sonnet needs more guidance.
Signed-off-by: Kai Xu <kaix@nvidia.com>
Force-pushed from 18eb9c2 to 6968ad6
Signed-off-by: Meng Xin <mxin@nvidia.com>
Force-pushed from bd2d3da to 4f61bad
Copy the nel-assistant skill as a local evaluation skill so we can extend it to support the evaluation requirements of optimized models. Update the modelopt orchestrator to reference the evaluation skill. Signed-off-by: Kai Xu <kaix@nvidia.com>
Force-pushed from 4f61bad to 28928a1
Add deployment skill (vLLM, SGLang, TRT-LLM serving) and update the modelopt orchestrator to support three pipelines:
- PTQ only
- PTQ + Deploy (serve as an API endpoint)
- PTQ + Evaluate (accuracy benchmark)
Signed-off-by: Kai Xu <kaix@nvidia.com>
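For the PTQ + Deploy pipeline, a minimal sketch of serving an exported checkpoint as an OpenAI-compatible endpoint with vLLM (the checkpoint path and model name below are hypothetical, and the `--quantization modelopt` backend is assumed to match the exported checkpoint format):

```shell
# Serve a ModelOpt-exported quantized checkpoint with vLLM
# (./my-model-nvfp4 is a hypothetical path).
vllm serve ./my-model-nvfp4 --quantization modelopt --port 8000

# Smoke-test the OpenAI-compatible endpoint.
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "./my-model-nvfp4", "prompt": "Hello", "max_tokens": 8}'
```

The SGLang and TRT-LLM pipelines would expose the same style of endpoint, so downstream benchmarking and evaluation can stay serving-backend agnostic.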
Force-pushed from 3a320f6 to 5c46798
Signed-off-by: Meng Xin <mxin@nvidia.com>
Thanks. The skills are still at an early stage, so it'd be great to get more people using them and giving feedback. Testing across a broader set of models and optimization recipes will help us iterate quickly and make the workflows more robust.
What does this PR do?
Type of change: ?
Adds a Claude Code skill suite for interactive model optimization with ModelOpt. The skills guide users through an end-to-end workflow: optimize the model with ModelOpt APIs, deploy it on vLLM and benchmark speed, and evaluate accuracy with NeMo Evaluator (nel).
Usage
Invoke the skill in Claude Code:
/ptq
Then say which model you want to quantize and with which quantization spec, e.g. "nvfp4, MLP only".
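To build intuition for what a spec like "nvfp4, MLP only" implies, here is a small, self-contained illustration of blockwise 4-bit floating-point fake quantization in the spirit of NVFP4. This is not ModelOpt's implementation: the E2M1 magnitude grid and the block size of 16 match the NVFP4 format, but the plain-float per-block scale (real NVFP4 stores FP8 block scales) and the rounding rule are simplifying assumptions.

```python
# Illustrative only -- not ModelOpt's implementation.
# Representable magnitudes of E2M1, the 4-bit float format behind NVFP4.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def fake_quantize_block(block):
    """Fake-quantize one block: pick a scale that maps the block's max
    magnitude onto E2M1's max (6.0), snap each value to the nearest
    representable magnitude, then rescale back to the original range."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return list(block)
    scale = amax / 6.0
    out = []
    for x in block:
        mag = min(E2M1_GRID, key=lambda g: abs(abs(x) / scale - g))
        out.append(mag * scale if x >= 0 else -mag * scale)
    return out

def fake_quantize(values, block_size=16):
    """Apply blockwise fake quantization (NVFP4 uses blocks of 16)."""
    return [y for i in range(0, len(values), block_size)
            for y in fake_quantize_block(values[i:i + block_size])]

weights = [0.03, -0.9, 0.31, 0.62, -1.2, 0.05, 0.0, 2.4]
print(fake_quantize(weights))
# ~ [0.0, -0.8, 0.4, 0.6, -1.2, 0.0, 0.0, 2.4] (up to float rounding)
```

The per-value quantization error (e.g. -0.9 becoming -0.8) is what calibration and recipe choices such as "MLP only" trade off against speed and memory.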
Testing
Before your PR is "Ready for review"
Make sure you read and follow Contributor guidelines and your commits are signed (
git commit -s -S).Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded
trust_remote_code=True,torch.load(..., weights_only=False),pickle, etc.).CONTRIBUTING.md: ✅ / ❌ / N/AAdditional Information